Analysis and Improved Recognition of Protein Names Using Transductive SVM

Authors

  • Masaki Murata
  • Tomohiro Mitsumori
  • Kouichi Doi
Abstract

We first analyzed protein names using various dictionaries and databases and identified five problems: the treatment of special characters, the treatment of homonyms, cases where a protein-name string is a substring of a different protein-name string, cases where one protein exists in different organisms, and the treatment of modifiers. We confirmed that a machine-learning approach to protein-name recognition can address these problems; accordingly, machine-learning methods have recently been used in research to recognize protein names. A classifier trained in a specific domain, however, can overfit and be so inflexible that it can only be used in that domain. We therefore developed a new corpus on breast cancer and investigated the flexibility of classifiers trained on the GENIA [1] or the breast-cancer corpora. We used a transductive support vector machine (SVM) to avoid overfitting and evaluated the effect of transductive learning. In our experiments, the transductive SVM prevented overfitting and yielded higher accuracies than the conventional SVM. Under the “Sub” criterion that we define in this paper, the transductive SVM increased the F-scores in our two experiments from 70.46 to 79.64 and from 70.63 to 74.61.
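As a reminder of what the reported F-scores measure, the following is a minimal sketch of an F-score computed over predicted protein-name spans. It assumes exact span matching, which is not the “Sub” criterion (that criterion is defined in the paper, not in this abstract). The `Span` type, the `f_score` function, and the example spans are hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# A span is (start_token, end_token_exclusive); this representation is an assumption.
Span = Tuple[int, int]


def f_score(gold: Set[Span], predicted: Set[Span]) -> float:
    """Exact-match F-score: the harmonic mean of precision and recall."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: the second prediction has the wrong right boundary.
gold = {(0, 2), (5, 6), (9, 12)}
predicted = {(0, 2), (5, 7), (9, 12)}
score = f_score(gold, predicted)
print(f"F-score: {100 * score:.2f}")
```

The abstract reports F-scores as percentages, so the sketch multiplies by 100 when printing. A more lenient matching criterion would count some boundary errors, such as the second prediction here, as correct and would therefore give a higher score than exact matching.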


Similar resources

A New Method of Transductive SVM-Based Network Intrusion Detection

Based on the existing Transductive SVM and via introducing smooth function P(·, ·) to construct smooth cored unconstrained optimization problem, this article will build the optimization model accessible to degenerate solutions to generate an improved transductive SVM, introduce simulated annealing to degenerate the optimization problem, and apply such a Support Vector Classifier to generate...


A novel classification technique based on progressive transductive SVM learning

The existing semisupervised techniques based on progressive transductive support vector machine (PTSVM) iteratively select transductive samples that are closest to the SVM margin bounds. This may result in selecting wrong patterns (i.e., patterns that when included in the semisupervised learning can be associated with a wrong label) as transductive samples, especially when poor initial training...


Object Recognition based on Local Steering Kernel and SVM

The proposed method recognizes objects by applying Local Steering Kernels (LSK) as descriptors to image patches. To represent the local properties of an image, patches are extracted where variations occur in the image. To find the interest points, a wavelet-based salient point detector is used. The Local Steering Kernel is then applied to the resultant pixels, in ...


A COMPARATIVE ANALYSIS OF WAVELET-BASED FEMG SIGNAL DENOISING WITH THRESHOLD FUNCTIONS AND FACIAL EXPRESSION CLASSIFICATION USING SVM AND LSSVM

This work presents a technique for the analysis of Facial Electromyogram signal activities to classify five different facial expressions for Computer-Muscle Interfacing applications. Facial Electromyogram (FEMG) is a technique for recording the asynchronous activation of neuronal inside the face muscles with non-invasive electrodes. FEMG pattern recognition is a difficult task for the researche...


Face Recognition using Eigenfaces, PCA and Support Vector Machines

This paper is based on a combination of principal component analysis (PCA), eigenfaces and support vector machines. Using the N-fold method, and depending on the value of N, each person’s face images are divided into two sections. As a result, vectors of training features and test features are obtained. Classification precision and accuracy were examined with three different types of kernel and...



Journal title:
  • JCP

Volume 3  Issue 

Pages  -

Publication date 2008